Randomized partition trees for exact nearest neighbor search

Authors

  • Sanjoy Dasgupta
  • Kaushik Sinha
Abstract

The k-d tree was one of the first spatial data structures proposed for nearest neighbor search. Its efficacy is diminished in high-dimensional spaces, but several variants, with randomization and overlapping cells, have proved to be successful in practice. We analyze three such schemes. We show that the probability that they fail to find the nearest neighbor, for any data set and any query point, is directly related to a simple potential function that captures the difficulty of the point configuration. We then bound this potential function in two situations of interest: the first, when data come from a doubling measure, and the second, when the data are documents from a topic model.
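As a rough illustration of the kind of structure the abstract refers to, the following is a minimal sketch of one randomized partition tree with a single-leaf ("defeatist") search. The abstract does not describe the three analyzed schemes in detail, so the split rule used here (a random direction, with the split placed at a random fractile between 1/4 and 3/4 of the projected points) and all function names are assumptions made for illustration only. The search can fail when the true nearest neighbor is separated from the query by a split; that failure event is what the abstract's probability bound concerns.

```python
import math
import random


def dot(a, b):
    """Inner product of two equal-length sequences."""
    return sum(x * y for x, y in zip(a, b))


def random_unit_vector(dim, rng):
    """Draw a direction uniformly from the unit sphere (normalized Gaussians)."""
    while True:
        v = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        norm = math.sqrt(sum(x * x for x in v))
        if norm > 0.0:
            return [x / norm for x in v]


def build_tree(points, leaf_size=8, rng=None):
    """Recursively split points along random directions (illustrative rule)."""
    if rng is None:
        rng = random.Random(0)
    if len(points) <= leaf_size:
        return {"leaf": points}
    direction = random_unit_vector(len(points[0]), rng)
    projections = sorted(dot(p, direction) for p in points)
    # Assumed split rule: a random fractile between 1/4 and 3/4.
    fraction = rng.uniform(0.25, 0.75)
    threshold = projections[int(fraction * (len(points) - 1))]
    left = [p for p in points if dot(p, direction) <= threshold]
    right = [p for p in points if dot(p, direction) > threshold]
    if not right:
        # Degenerate split (ties in projection): stop and keep a leaf.
        return {"leaf": points}
    return {
        "direction": direction,
        "threshold": threshold,
        "left": build_tree(left, leaf_size, rng),
        "right": build_tree(right, leaf_size, rng),
    }


def defeatist_search(tree, query):
    """Route the query to one leaf and return the closest point in that leaf.

    No backtracking is done, so the result may not be the true nearest
    neighbor of the query.
    """
    node = tree
    while "leaf" not in node:
        go_left = dot(query, node["direction"]) <= node["threshold"]
        node = node["left"] if go_left else node["right"]
    return min(node["leaf"], key=lambda p: math.dist(p, query))
```

A query that coincides with a stored point is routed to that point's own leaf, so it is always recovered; for other queries, comparing the result against a linear scan over many random trees gives an empirical estimate of the failure probability.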

Similar resources

Sparse Randomized Partition Trees for Nearest Neighbor Search

Randomized partition trees have recently been shown to be very effective in solving the nearest neighbor search problem. Despite enjoying strong theoretical guarantees, they suffer from high space complexity, since each internal node of the tree needs to store a d-dimensional projection direction, leading to an O(nd) space complexity for a dataset of size n. Inspired by the fast Johnson-Lindenstraus...

Fast ℓ1-norm Nearest Neighbor Search Using A Simple Variant of Randomized Partition Tree

For big data applications, randomized partition trees have recently been shown to be very effective in answering high-dimensional nearest neighbor search queries with provable guarantees when distances are measured using the ℓ2 norm. Unfortunately, if distances are measured using the ℓ1 norm, the same theoretical guarantee does not hold. In this paper, we show that a simple variant of randomized partitio...

Fast Approximate Nearest Neighbors with Automatic Algorithm Configuration

For many computer vision problems, the most time consuming component consists of nearest neighbor matching in high-dimensional spaces. There are no known exact algorithms for solving these high-dimensional problems that are faster than linear search. Approximate algorithms are known to provide large speedups with only minor loss in accuracy, but many such algorithms have been published with onl...

Which Spatial Partition Trees are Adaptive to Intrinsic Dimension?

Recent theory work has found that a special type of spatial partition tree – called a random projection tree – is adaptive to the intrinsic dimension of the data from which it is built. Here we examine this same question, with a combination of theory and experiments, for a broader class of trees that includes k-d trees, dyadic trees, and PCA trees. Our motivation is to get a feel for (i) the ki...

Fast Nearest Neighbor Search in SE(3) for Sampling-Based Motion Planning

Nearest neighbor searching is a fundamental building block of most sampling-based motion planners. We present a novel method for fast exact nearest neighbor searching in SE(3), the 6-dimensional space that represents rotations and translations in 3 dimensions. SE(3) is commonly used when planning the motions of rigid-body robots. Our approach starts by projecting a 4-dimensional cube onto the 3-...

Publication date: 2013